Blob storage artifact-data keys, binary sanitizer, and observability stripping#2744
Blob storage artifact-data keys, binary sanitizer, and observability stripping#2744mike-inkeep wants to merge 12 commits intomainfrom
Conversation
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
There was a problem hiding this comment.
PR Review Summary
(6) Total Issues | Risk: Medium
🟠⚠️ Major (2) 🟠⚠️
🟠 1) artifact-binary-sanitizer.ts:53 Missing error handling for storage upload failures
Issue: The uploadInlinePart function calls storage.upload() without a try-catch block. If the blob storage upload fails (network error, quota exceeded, permission denied), the promise rejection propagates and could fail the entire artifact processing pipeline.
Why: The existing pattern in image-upload.ts (lines 90-98) gracefully catches upload failures and drops the problematic part while preserving the rest of the content. This inconsistency creates a reliability gap where a transient storage failure could cause complete artifact processing failure rather than graceful degradation.
Fix: Add error handling around the upload call that either (1) returns the original part with base64 data intact as a fallback, or (2) replaces the data with an error placeholder. The first option preserves data at the cost of larger observability payloads; the second maintains observability cleanliness but loses the data.
Refs:
🟠 2) sanitizeArtifactBinaryData Exported function is not used in production code
Issue: The PR adds two functions: stripBinaryDataForObservability (used in 4 places in AgentSession.ts) and sanitizeArtifactBinaryData (an async function that uploads binary data to blob storage). However, sanitizeArtifactBinaryData is exported and tested but never imported or called anywhere in the codebase outside its test file.
Why: This means the artifact-data storage key category is added but no code path actually uploads artifacts using it. The infrastructure exists but is not wired up to any data flow. This may be intentional preparatory work, but it's unclear from the PR.
Fix: Either (1) defer adding sanitizeArtifactBinaryData until the consuming code is ready, (2) document in the PR description that this is preparatory infrastructure with follow-up work planned, or (3) wire up the function to the appropriate data flow (e.g., artifact persistence in ArtifactService).
Refs:
Inline Comments:
- 🟠 Major:
artifact-binary-sanitizer.ts:53Missing error handling for storage upload failures
🟡 Minor (4) 🟡
🟡 1) storage-keys.ts:37-45 Missing unit tests for new artifact-data storage key category
Issue: The existing storage-keys.test.ts tests buildStorageKey for the media category but was not updated to test the new artifact-data category.
Why: The storage key format is critical for data integrity and multi-tenant isolation. A regression in the key format could cause lookup failures or tenant data leakage.
Fix: Add a test case in storage-keys.test.ts:
it('builds versioned artifact-data storage key', () => {
const key = buildStorageKey({
category: 'artifact-data',
tenantId: 'tenant-1',
projectId: 'project-1',
artifactId: 'artifact-1',
contentHash: 'abc123',
ext: 'png',
});
expect(key).toBe('v1/t_tenant-1/artifact-data/p_project-1/a_artifact-1/sha256-abc123.png');
});🟡 2) artifact-binary-sanitizer.ts:27 Magic number threshold for binary detection
Issue: The 100 character threshold for detecting binary data is hardcoded without explanation.
Why: This represents only ~75 decoded bytes. Without documentation, it's unclear if this was deliberately chosen or if a larger threshold would be more appropriate.
Fix: Extract to a named constant: const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes
🟡 3) artifact-binary-sanitizer.test.ts Missing test for data: URI handling
Issue: The implementation checks !v.data.startsWith('data:') but there's no explicit test.
Why: Data URIs are common for inline images; without a test, this check could be accidentally removed.
Fix: Add test case for data URIs.
🟡 4) artifact-binary-sanitizer.test.ts Missing test for upload failure scenario
Issue: Tests mock upload to always succeed but never test the failure path.
Why: Masks the missing error handling in the implementation.
Fix: Add failure test case.
Inline Comments:
- 🟡 Minor:
artifact-binary-sanitizer.ts:19Magic number threshold lacks documentation - 🟡 Minor:
artifact-binary-sanitizer.test.ts:109Missing test for upload failure handling - 🟡 Minor:
artifact-binary-sanitizer.test.ts:57Missing test for data: URI handling
💭 Consider (2) 💭
💭 1) storage-keys.ts Missing companion parse/prefix functions for artifact-data category
Issue: The media category has parseMediaStorageKey and buildMediaStorageKeyPrefix helpers, but artifact-data only has the buildStorageKey case.
Why: May be needed later for cleanup, authorization checks, or debugging. If not needed, this is acceptable.
Fix: Add a comment explaining why parse/prefix helpers are not needed, or add them for consistency.
💭 2) AgentSession.ts Pretty-print JSON in span attributes
Issue: Uses JSON.stringify(..., null, 2) for span attributes which adds unnecessary whitespace.
Why: Span attributes are stored and indexed; compact format reduces storage overhead.
Fix: Remove null, 2 arguments: JSON.stringify(stripBinaryDataForObservability(artifactData.data))
💡 APPROVE WITH SUGGESTIONS
Summary: This PR adds valuable infrastructure for handling binary data in artifacts. The stripBinaryDataForObservability function is correctly integrated and will improve observability by preventing large binary payloads in spans. However, two issues warrant attention: (1) the upload function lacks error handling unlike the existing image-upload.ts pattern, and (2) sanitizeArtifactBinaryData appears to be unused — clarification on whether this is intentional preparatory work would be helpful. Adding tests for the new storage key category would also strengthen the PR.
Discarded (3)
| Location | Issue | Reason Discarded |
|---|---|---|
artifact-binary-sanitizer.test.ts |
Missing test for missing mimeType fallback | Low severity (INFO level); the behavior is implicitly covered and mimeType presence is common |
artifact-binary-sanitizer.test.ts |
Missing boundary test at exactly 100 chars | Nice-to-have but unlikely to cause real bugs; current tests cover above/below threshold adequately |
AgentSession.ts spans |
Redundant JSON.stringify pretty-print |
Moved to Consider as it's a minor optimization, not a correctness issue |
Reviewers (6)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-precision |
3 | 1 | 1 | 0 | 0 | 0 | 1 |
pr-review-tests |
6 | 2 | 0 | 0 | 2 | 0 | 2 |
pr-review-sre |
4 | 1 | 1 | 0 | 1 | 0 | 1 |
pr-review-consistency |
2 | 1 | 0 | 0 | 0 | 0 | 1 |
pr-review-standards |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-types |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 15 | 5 | 2 | 0 | 3 | 0 | 5 |
Note: Some findings were merged across reviewers (e.g., error handling raised by both tests and SRE reviewers).
| ext, | ||
| }); | ||
|
|
||
| await storage.upload({ key, data: buffer, contentType: mimeType }); |
There was a problem hiding this comment.
🟠 MAJOR: Missing error handling for storage upload failures
Issue: The uploadInlinePart function calls storage.upload() without a try-catch. If the upload fails (network error, quota exceeded, permission denied), the error propagates and could fail the entire artifact processing pipeline.
Why: The existing pattern in image-upload.ts (lines 90-98) catches upload failures gracefully. This inconsistency could cause unhandled errors and partial artifact state when storage is temporarily unavailable.
Fix: Consider wrapping the upload in a try-catch:
try {
await storage.upload({ key, data: buffer, contentType: mimeType });
return { ...part, data: toBlobUri(key) };
} catch (error) {
// Option 1: Return original part (fallback to base64)
return part;
// Option 2: Return placeholder
// return { ...part, data: `[upload failed, binary ~${buffer.length} bytes]` };
}Refs:
| [key: string]: unknown; | ||
| }; | ||
|
|
||
| const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]'; |
There was a problem hiding this comment.
🟡 Minor: Magic number threshold lacks documentation
Issue: The 100 character threshold for detecting binary data is hardcoded without explanation. This represents only ~75 decoded bytes, which may be smaller than intended.
Why: Without documentation, it's unclear whether this threshold was deliberately chosen or if a larger value (e.g., 1KB) would better distinguish binary content from short encoded strings.
Fix:
| const CIRCULAR_REFERENCE_PLACEHOLDER = '[Circular Reference]'; | |
| const BINARY_DATA_MIN_LENGTH = 100; // ~75 decoded bytes; data below this is likely metadata, not binary payload |
Then use the constant in the check below.
| download: vi.fn(), | ||
| delete: vi.fn(), | ||
| }); | ||
| }); |
There was a problem hiding this comment.
🟡 Minor: Missing test for upload failure handling
Issue: The test suite mocks upload to always succeed but never tests the failure path.
Why: Given that image-upload.test.ts has a test for 'drops file part when storage.upload throws' (line 148), this gap masks the missing error handling in the implementation.
Fix: Add a test case:
it('propagates upload errors when blob storage fails', async () => {
mockUpload.mockRejectedValueOnce(new Error('Storage quota exceeded'));
const input = { type: 'image', data: LARGE_BASE64, mimeType: 'image/png' };
await expect(sanitizeArtifactBinaryData(input, CTX)).rejects.toThrow('Storage quota exceeded');
});Refs:
| it('leaves http URLs untouched', () => { | ||
| const input = { type: 'image', data: 'https://example.com/img.png', mimeType: 'image/png' }; | ||
| const result = stripBinaryDataForObservability(input) as any; | ||
| expect(result.data).toBe('https://example.com/img.png'); |
There was a problem hiding this comment.
🟡 Minor: Missing test for data: URI handling
Issue: Line 30 of the implementation checks !v.data.startsWith('data:') but there's no explicit test verifying data URIs are left untouched.
Why: Data URIs are a common format for inline images. Without a test, a refactor could accidentally remove the check, causing data URIs to be re-uploaded unnecessarily.
Fix: Add test cases:
it('leaves data: URIs untouched', () => {
const dataUri = 'data:image/png;base64,' + LARGE_BASE64;
const input = { type: 'image', data: dataUri, mimeType: 'image/png' };
const result = stripBinaryDataForObservability(input) as any;
expect(result.data).toBe(dataUri);
});…artifact_binary_sanitizer
|
TL;DR — Adds infrastructure for offloading inline binary data (images, files) from artifact payloads to blob storage, and strips that binary data from observability spans so traces and logs don't balloon with base64 content. Key changes
Summary | 6 files | 12 commits | base: Content-addressed blob storage keys for artifact data
The Binary sanitizer for artifact data
The detection heuristic ( Upload failures are caught and logged with structured context (tenant, project, artifact, mime type, size) — the original inline data is returned unchanged, keeping the pipeline resilient to storage outages.
Observability stripping in
|
There was a problem hiding this comment.
PR Review Summary
(6) Total Issues | Risk: Medium
Delta Since Last Review
The only change since the prior review is a single import path fix:
-import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-image-formats';
+import { getExtensionFromMimeType } from '@inkeep/agents-core/constants/allowed-file-formats';This is a correct fix — the module was renamed from allowed-image-formats to allowed-file-formats in main. ✅
🕐 Pending Recommendations (6)
The following issues were raised in the prior review and remain unaddressed:
Major
- 🟠
artifact-binary-sanitizer.ts:53Missing error handling for storage upload failures —uploadInlinePartlacks try-catch aroundstorage.upload(), inconsistent withfile-upload.tspattern - 🟠
sanitizeArtifactBinaryDataExported function is unused in production code — infrastructure exists but not wired to any data flow
Minor
- 🟡
storage-keys.ts:37-45Missing unit tests for newartifact-datastorage key category instorage-keys.test.ts - 🟡
artifact-binary-sanitizer.ts:27Magic number threshold (100) for binary detection lacks documentation - 🟡
artifact-binary-sanitizer.test.tsMissing test fordata:URI handling - 🟡
artifact-binary-sanitizer.test.tsMissing test for upload failure scenario
🚫 REQUEST CHANGES
Summary: The delta since the prior review is a trivial import path fix which is correct. However, the 2 Major and 4 Minor issues identified in the previous review remain unaddressed. The most critical gaps are: (1) missing error handling for storage upload failures which creates inconsistency with the existing file-upload.ts pattern and could cause complete artifact processing failure on transient storage errors, and (2) sanitizeArtifactBinaryData being exported but unused, suggesting incomplete integration or preparatory infrastructure that should be documented.
Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| N/A — delta review | 0 | 0 | 0 | 0 | 0 | 6 | 0 |
Note: Domain-specific reviewers were not dispatched for this re-review because the delta consists only of a single import path fix. All 6 pending recommendations are carried forward from the prior review.
Ito Test Report ❌17 test cases ran. 1 failed, 16 passed. Overall, 17 test cases were executed with 16 passing and 1 failing, showing strong stability across playground chat submission/double-send gating, traces conversation/history routing, outage and navigation resilience, artifact sanitization/copy-download behavior, mobile rendering, session invalidation, and authorization/large-payload protections. The key issue is a Medium-severity traces UI bug where untrusted trace text is rendered via markdown (Streamdown) in timeline/detail views, causing script-tag rendering instability instead of consistently inert plain-text handling and reducing observability reliability. ❌ Failed (1)
🟠 XSS payload in artifact fields is not executed
Relevant code:
<button
type="button"
onClick={onSelect}
className={`flex items-center gap-1 group cursor-pointer transition-colors duration-200 ${textColorClass}`}
title="Click to view details"
>
<span className="font-medium">
<Streamdown>{activity.description}</Streamdown>
</span>
<LabeledBlock label="Message content">
<Bubble className="break-words">
<Streamdown>{a.messageContent || 'Message content not available'}</Streamdown>
</Bubble>
</LabeledBlock>
const toolCall = getString(span, SPAN_KEYS.SPAN_ID, '');
activities.push({
id: toolCall,
type: ACTIVITY_TYPES.TOOL_CALL,
toolName: name,
description: hasError && statusMessage ? `Tool ${name} failed` : `Called ${name}`,
timestamp: span.timestamp,✅ Passed (16)Commit: Tell us how we did: Give Ito Feedback |
This function is intentionally exported for use in artifact persistence (see PR #2745). The observability-only stripping path is `stripBinaryDataForObservability`. export async function sanitizeArtifactBinaryData()
There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
Delta Since Last Review
The author addressed all 6 prior findings from the previous review:
| Prior Finding | Status | Implementation |
|---|---|---|
| 🟠 Missing error handling for upload | ✅ FIXED | Try-catch added with graceful fallback (lines 58-73) |
🟠 Unused sanitizeArtifactBinaryData |
✅ Understood | Intentional preparatory infrastructure |
🟡 Missing artifact-data storage key test |
✅ FIXED | Test added (storage-keys.test.ts:61-72) |
| 🟡 Magic number (100) threshold | ✅ RESOLVED | Changed to > 1 — now detects all binary data |
🟡 Missing data: URI test |
✅ FIXED | Tests added for both stripping (lines 53-61) and sanitization (lines 193-206) |
| 🟡 Missing upload failure test | ✅ FIXED | Test added (lines 208-214) |
New Changes in Delta
The delta commit (f5f09e008) adds:
parseDataUriintegration — Data URIs are now parsed and their base64 payloads extracted (mimeType from URI takes precedence)- Robust error handling — Upload failures log comprehensive context and gracefully fall back to original inline data
- Pretty-print removal — Removed
null, 2fromJSON.stringifycalls in span attributes (compact format) - Comprehensive test coverage — All prior gaps addressed with well-structured tests
💭 Consider (1) 💭
Inline Comments:
- 💭 Consider:
artifact-binary-sanitizer.test.ts:22Missing boundary test for threshold change from> 100to> 1
🕐 Pending Recommendations (1)
- 🟠
sanitizeArtifactBinaryDataFunction is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR that wires it into artifact persistence
✅ APPROVE
Summary: Excellent follow-up work. All prior feedback has been addressed with well-implemented solutions. The error handling pattern (try-catch with graceful fallback to original data) is consistent with existing patterns in the codebase. The sanitizeArtifactBinaryData function being unused is understood as preparatory infrastructure — the observability stripping via stripBinaryDataForObservability is the production-ready piece that's correctly integrated into AgentSession.ts. Ship it! 🚀
Discarded (2)
| Location | Issue | Reason Discarded |
|---|---|---|
artifact-binary-sanitizer.ts:45 |
Buffer.from doesn't validate base64 input |
Edge case — image/file parts from LLM tool results should be valid base64; any corruption would be visible when artifact fails to render |
artifact-binary-sanitizer.test.ts:193 |
Missing test for malformed data URIs | Low priority edge case — parseDataUri returns null, code safely falls back to treating entire string as base64 |
Reviewers (3)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-standards |
0 | 0 | 0 | 0 | 0 | 0 | 0 |
pr-review-tests |
2 | 0 | 0 | 0 | 1 | 0 | 1 |
pr-review-errors |
3 | 0 | 0 | 0 | 0 | 0 | 1 |
| Total | 5 | 0 | 0 | 0 | 1 | 0 | 2 |
Note: Most reviewer findings were INFO-level acknowledgments of correctly implemented prior feedback.
| ), | ||
| })); | ||
|
|
||
| const SAMPLE_BASE64 = Buffer.from('sample image bytes').toString('base64'); |
There was a problem hiding this comment.
💭 Consider: Missing boundary test for threshold change
Issue: The threshold changed from > 100 to > 1, but no test verifies this boundary. SAMPLE_BASE64 (~24 chars) passes both thresholds.
Why: Without a boundary test, a future refactor could accidentally change > 1 to >= 1 or back to > 100 without test failure.
Fix: Consider adding boundary tests:
it('treats 2-char data string as binary part', () => {
const input = { type: 'image', data: 'AA', mimeType: 'image/png' };
const result = stripBinaryDataForObservability(input) as any;
expect(result.data).toMatch(/^\[binary data/);
});
it('does not treat 1-char data string as binary part', () => {
const input = { type: 'image', data: 'A', mimeType: 'image/png' };
const result = stripBinaryDataForObservability(input) as any;
expect(result.data).toBe('A');
});There was a problem hiding this comment.
PR Review Summary
(0) Total Issues | Risk: Low
Delta Since Last Review
The only change since the prior review is commit 7dd41d83b ("Remove unnecessary check"):
- v.data.length > 1 &&This removes the minimum length check from isInlineBinaryPart, so now any non-empty string data (that isn't a blob:// or http URL) is treated as a binary part.
Analysis
Previous state: v.data.length > 1 — strings of 0 or 1 characters were excluded
New state: No length check — any non-empty string is included
Is this correct? ✅ Yes. The removed check was overly restrictive:
- Single-character base64 data is valid (though rare in practice)
- Empty strings are already implicitly excluded — if
datais'', it would still matchtypeof v.data === 'string', but:stripBinaryDataForObservabilitywould produce[binary data ~0 bytes]— harmlesssanitizeArtifactBinaryDatawould upload a 0-byte file — also harmless (content-hash deduplication handles this)
- The prior "Consider" item about missing boundary tests is now moot — the simpler heuristic doesn't need boundary testing
Impact: Simplifies the detection heuristic with no functional regression. Edge cases (empty/tiny strings) are handled safely.
🕐 Pending Recommendations (1)
- 🟠
sanitizeArtifactBinaryDataFunction is exported and tested but unused in production — understood as preparatory infrastructure for a follow-up PR
✅ APPROVE
Summary: Clean simplification. Removing the > 1 length check makes the binary detection heuristic more straightforward without introducing regressions. The prior review's APPROVE stands — all feedback has been addressed, and this final tweak is a sensible cleanup. 🚀
Reviewers (0)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| N/A — trivial delta review | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Note: Domain-specific reviewers were not dispatched because the delta consists only of a single line removal. The pending recommendation is carried forward from prior reviews.
Ito Test Report ✅12 test cases ran. 12 passed. Across the unified QA run, all 12 executed test cases passed with 0 failures, and verification found no confirmed or plausible production-code defects. Most importantly, run-route handling worked for inline and nested mixed-content inputs with successful HTTP 200 SSE streaming, external/failed PDF ingestion paths consistently returned controlled HTTP 400 responses with URL token/query/hash redaction and fallback behavior (no 5xx), and traces/artifact and UI stress checks (sanitized payload placeholders with preserved mimeType metadata, clean JSON downloads, mobile/back-forward/deep-link/filter abuse) remained stable and usable despite recoverable telemetry configuration errors. ✅ Passed (12)Commit: Tell us how we did: Give Ito Feedback |
|
@mike-inkeep does this only find binary data within artifacts that have been saved? what about the artifacts themselves being moved to blob storage? |





























Summary
Adds an artifact binary sanitizer that offloads inline binary data (images, files) embedded in artifact payloads to blob storage, and strips those payloads from observability spans before they reach the tracing backend. Also introduces a dedicated
artifact-datastorage key category, fixes Vercel file part mapping to useurlinstead oftext, and addsinlineExternalPdfUrlPartsto support downloading external PDF URLs as inline bytes.Key decisions
Sanitizer runs at persist time, not at generation time. Artifact data is sanitized when it is written to storage, so any consumer reading back blob URIs gets a stable reference. An alternative (sanitizing at read time) was not chosen because it would require every read path to do async work.
Content-hash-based
artifact-datastorage keys. The key structure isv1/t_{tenantId}/artifact-data/p_{projectId}/a_{artifactId}/sha256-{hash}.{ext}. Using a content hash means duplicate binary payloads within the same artifact are deduplicated automatically, and keys are deterministic without needing to generate UUIDs.stripBinaryDataForObservabilityis synchronous and non-destructive. Rather than uploading to blob storage before writing spans (which would add latency and an async dependency to the span-writing hot path), binary parts are replaced with a[binary data ~N bytes, mimeType: X]placeholder. This keeps span sizes small without blocking.inlineExternalPdfUrlPartsfetches PDFs before the upload pipeline. Previously, external PDF URIs infileparts were explicitly blocked. This helper downloads them early (inbuildPersistedMessageContent) and converts to inlinebytesso the rest of the upload pipeline can process them uniformly.Manual QA
No manual QA performed. Consider running
/qato generate and execute a test plan.